Using Visualization and Automation to Accelerate Genetics Discovery

Authors

  • Ross Eugene Curtis
  • Kathryn Roeder
  • Daniel Weeks
  • Sally Wenzel
  • Eric P Xing
Abstract

The ten years since the completion of the human genome sequencing project have seen huge advances in our understanding of the genetic basis of human disease. Identifying the genes and causal genomic polymorphisms involved in disease holds the promise of better treatment and prevention. Much of the recent progress has come through the genome-wide association study (GWAS). However, despite the success of GWAS, its findings often fail to explain the full heritability of a disease, or they include SNPs that affect a disease through an unknown biological mechanism. Incorporating gene expression or clinical trait data into GWAS is one approach that can further elucidate the mechanisms behind SNP-disease associations. These so-called intermediate phenotypes have inherent structures, such as correlations and interactions, which can be leveraged to facilitate discovery. The promise of these data has motivated a new generation of GWAS algorithms, termed structured association mapping, which use machine learning techniques to leverage structures in the data and uncover associations between the genome, transcriptome, and phenome. However, the increasing amounts of data used in GWAS, and the complexity of the methods used to analyze them, demand a new integrative approach to genetics discovery. To fully capture the potential of today's genetic data, we must rely on the strengths of both machines and people. With this in mind, I have developed a visual analytics software system called GenAMap. GenAMap automates the execution of structured association mapping algorithms, making them available to genetics analysts. Through GenAMap, I introduce new visualizations that enable analysts to explore the structure of genetic data while considering genomic associations.
By integrating strengths from the machine learning, visualization, and genetics fields, and through the analysis of human asthma, yeast, and mouse datasets, I show that GenAMap has the potential to facilitate and advance genetics discovery. In this work I also demonstrate the application of visualization and machine learning to another domain in genetics research: the study of dynamic genetic networks. I present TVNViewer, an online visualization tool for exploring these networks, and use yeast and breast cancer datasets to show how the visualizations in TVNViewer enable the analysis and exploration of the networks as they change across time and space. As the amount of available genetic data continues to grow, the integration of visualization and machine learning techniques has the potential to accelerate genetics discovery.

Publication date: 2011